---
title: Analysis of Star Column
keywords: fastai
sidebar: home_sidebar
summary: "Objective: "
description: "Objective: "
nb_path: "02_analysis_star.ipynb"
---
df_sub.head(2)
Visualise a word cloud of the top 50 words in the cleaned Star field. This should give an overall picture of the words most commonly used over the years when filling in the Star column of the lesson feedback.
After an initial trial, the following words were excluded because they are known to be present throughout the dataset and add little insight: {"teacher","student","students","lesson","class","pupil","pupils"}
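The filtering step described above can be sketched without any plotting library. This is a minimal sketch under stated assumptions, not the notebook's `plot_wc` implementation: `top_words`, `SKIP_WORDS`, and the sample string are hypothetical names introduced for illustration, and tokenisation is assumed to be a simple lowercase whitespace split.

```python
from collections import Counter

# Words excluded because they dominate the dataset (set taken from the text above).
SKIP_WORDS = {"teacher", "student", "students", "lesson", "class", "pupil", "pupils"}


def top_words(text, n=50, skip=SKIP_WORDS):
    """Return the n most frequent whitespace-separated tokens, ignoring skipped words."""
    tokens = [tok for tok in text.lower().split() if tok not in skip]
    return Counter(tokens).most_common(n)


# Hypothetical sample standing in for the joined co_star_clean text.
sample = "good questioning students engaged good pace lesson good questioning"
print(top_words(sample, n=3))
```

The resulting (word, count) pairs are the kind of frequencies a word cloud would be drawn from; the skipped words never reach the counter, so they cannot crowd out more informative terms.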
co_star_clean = ' '.join(df_sub['co_star_clean'].tolist())
plot_wc(co_star_clean)
plot_wc(verbs)
plot_wc(noun)
plot_wc(adj,20)
plot_wc(adv,20)
plt.figure(figsize=(10,15))
sns.barplot(x=df.frequency, y=df['bigram'])
plt.show()
visulaizeBigrams(bigram_df=df, K=12)
fig, axs = plt.subplots(10,5, figsize=(25, 40), facecolor='w', edgecolor='k')
fig.subplots_adjust(hspace = .5, wspace=.25)
axs = axs.ravel()
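for i in range(len(df)):
    bigram = df.bigram[i]
    # Rows whose raw Star text contains this bigram
    test_df = df_sub[df_sub['co_star'].str.contains(bigram)]
    # Number of matching rows per co_lpm_follow value
    ser = test_df.groupby('co_lpm_follow').count()['cd_class']
    try:
        sns.barplot(x=ser.index, y=ser.values, ax=axs[i])
    except Exception:
        # Leave the panel empty if there is nothing to plot
        pass
    axs[i].set_title(bigram)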
plot_multi_wc(len(years),wcs,texts)
texts = ["Top 20 VERBS words of {}".format(i) for i in years]
plot_multi_wc(len(years),verbs,texts)
texts = ["Top 20 NOUNS words of {}".format(i) for i in years]
plot_multi_wc(len(years),nouns,texts)
texts = ["Top 20 Adjectives words of {}".format(i) for i in years]
plot_multi_wc(len(years),adjs,texts)
texts = ["Top 20 Adverbs words of {}".format(i) for i in years]
plot_multi_wc(len(years),advs,texts)
texts = ["Top 50 Bigrams/Trigram of {}\n".format(i) for i in years]
plot_bigram(bigrams,texts)
texts = ["Network of Bigrams/Trigram of {}".format(i) for i in years]
visulaizeBigrams_multi(bigrams,texts,K=10)